Apparatus for speech compression

ABSTRACT

Means for recording and selectively deleting portions of normal speech sound which includes a recorder for receiving and recording speech signals from an input, with a drive means being provided for the recorder, and with a power supply being provided for the drive means. A speech detector is coupled to the power supply for the drive means and is arranged to energize the drive means only in response to the presence of a speech signal in the input. A vowel detector is provided and is coupled to the drive means power supply for detecting the initiation and continuing presence of vowel sounds in speech signals. The vowel detector is adapted to regularly and periodically interrupt the drive means power supply for certain predetermined time intervals in response to the initiation and continued presence of vowel sounds in the input.

United States Patent 3,723,667
Park, Jr. et al.
Mar. 27, 1973

[54] APPARATUS FOR SPEECH COMPRESSION
[75] Inventors: John H. Park, Jr., St. Paul; William [remainder of inventor list illegible in source], Minn.
[73] Assignee: PKM Corporation, St. Paul, Minn.
[22] Filed: Jan. 3, 1972
[21] Appl. No.: 214,615
[51] Int. Cl.: G11b 19/20 [remainder illegible in source]
[56] References Cited
UNITED STATES PATENTS
2,411,501  11/1946  Brubaker  179/100.1 VC
2,115,803  5/1938  Dudley  179/15.55 R
3,532,821  10/1970  Nakata et al.  179/1 SA
3,428,748  2/1969  Flanagan  179/1 SA
Attorney: Orrin M. Haugen
11 Claims, 13 Drawing Figures

[FIG. 1: Realization of speech compressor (block diagram)]

U.S. Patent 3,723,667, patented Mar. 27, 1973, drawing sheets 2 through 7 of 7

[FIG. 2: Pre-filter characteristic; relative amplitude in db versus frequency in Hz, 62.5 Hz to 16,000 Hz]
[FIG. 3: Spectrum shaping for speech detector; relative amplitude in db versus frequency in Hz]
[FIG. 4: Spectrum shaping for vowel detector; relative amplitude in db versus frequency in Hz]
[FIG. 5: Speech detector; spectrum shaping, envelope detector, threshold detector]
[FIG. 6: Vowel detector; spectrum shaping, envelope detector, threshold detector]
[FIG. 7: Timing diagram for speech compression; original speech elapsed time 1.12 sec., compressed speech elapsed time .54 sec.]
[FIG. 8: Realization of speech expander]
[FIG. 9: Timing diagram for speech expansion; elapsed time on playback 2.7 sec.]
[FIG. 10: Vowel chopper]
[FIG. 11: Pause indicators (audio and visual); all transistors 2N3392 unless specified]
[FIG. 12: Compression meter]
[FIG. 13: Portion of speech expander]

BACKGROUND OF THE INVENTION

The present invention relates generally to a means for recording and compressing speech sound, and more particularly to a method and apparatus for recording and selectively deleting pauses as well as certain portions of normal speech sound from the recording. It has been found that controlled and selective deletion of certain portions of normal speech leaves the recorded message highly intelligible, even when compressed to a time of less than one-half of the actual speech.

Studies have indicated that the normal human ear and brain are rarely, if ever, overtaxed when listening to human speech at normal rates. Furthermore, studies have indicated that a normal listener is able to understand and comprehend speech even when delivered at a rate at least 3 times as rapid as natural speech. Accordingly, in recording lectures, business memoranda, or the like, much time can be saved by compressing the speech in terms of time, without deleting any significant portions of the spoken words or detracting from the intelligibility.

In the past, speech compression has been accomplished by means of systematic or periodic deletion of certain portions of the spoken message. Such a device is described in an article by Fairbanks, et al., "Method for Time or Frequency Compression-Expansion of Speech," Transactions of the I.R.E., PG on Audio, Vol. AU-2, No. 1, January-February, 1954, pages 7-11, and achieves a time compression of the speech input by periodically discarding a fixed segment of the input and bringing the ends of the retained input together to make a continuous, time-shortened signal. If the retained segment is sufficiently long with respect to the fundamental pitch period of the voice, then the voice will retain most of its natural quality. The deleted segment must be sufficiently long with respect to the retained segment to effect the desired or required time compression, but not so long as to obscure the important transitional elements or consonants in speech, which are normally of short duration. Inasmuch as the practice of bringing the ends of the retained segments together results in an apparent lowering of the frequency of the voice, either the input medium must be played in at a faster than normal rate or, in the alternative, the output must be played back after processing at an increased rate. The device described by Fairbanks et al. attains the necessary frequency shifting by utilizing a rotating head assembly.
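The systematic deletion described above can be sketched on a list of samples. This is only an illustration of the keep-and-discard idea, not the rotating head device of Fairbanks et al.: the function name `periodic_deletion` and the parameters `keep_len` and `drop_len` are hypothetical, and the sketch does not model the accompanying frequency shift.

```python
def periodic_deletion(samples, keep_len, drop_len):
    """Keep keep_len samples, discard the next drop_len, and repeat.

    The retained segments are simply abutted, which is the
    "bringing the ends together" step described in the text.
    """
    if keep_len <= 0 or drop_len < 0:
        raise ValueError("keep_len must be positive and drop_len non-negative")
    period = keep_len + drop_len
    out = []
    for start in range(0, len(samples), period):
        out.extend(samples[start:start + keep_len])
    return out


signal = list(range(20))  # stand-in for 20 audio samples
compressed = periodic_deletion(signal, keep_len=3, drop_len=2)
print(compressed)                     # [0, 1, 2, 5, 6, 7, 10, 11, 12, 15, 16, 17]
print(len(compressed) / len(signal))  # 0.6
```

Note that the deletion points fall wherever the period dictates, regardless of whether the discarded samples belong to a pause, a vowel, or a consonant.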

Other devices utilizing similar techniques may employ tapped delay lines in which the signal is taken from taps which are sampled at a suitable rate to achieve the desired shift and bring the ends of the retained segments together.

Those speech compression devices which utilize systematic or periodic deletion of input suffer from a number of disadvantages. For example, those mechanical devices which utilize rotating head assemblies require careful adjustment and maintenance, and are considered complex and expensive. Mechanical delay lines, which have been utilized in the past, are sensitive to mechanical shock. Electronic delay lines have also been utilized. Furthermore, the extent of time compression which can be derived from systematic deletion is limited to no less than about 60 percent of the original time, since if additional compression is undertaken, the portion retained is such that many of the transitional elements of the sound are either blurred or deleted, thereby reducing intelligibility.

The time compression which is obtained from systematic deletion is frequently unnatural when compared with the normal human production of rapid speech. Studies have shown that the normal speaker, when attempting to speak more rapidly, will initially shorten the pauses between phonemes by bringing spoken sounds more closely together without shortening the spoken sounds proportionally. It has been further found that the shortening that does occur when the speaker is attempting to speak at a more rapid rate takes place in the voiced or vowel-like sounds. It is believed that the transitional elements, particularly unvoiced consonants, cannot be appreciably shortened in duration since manipulation of the vocal apparatus is more intricate and involved for these sounds than for the longer vowel sounds. Accordingly, rapid human speech is characterized by shortened or minimal pauses along with shortened vowel-like sounds in the speech. To remain reasonably intelligible, transitional elements including the unvoiced consonants are shortened only very slightly, if at all.

It follows, therefore, that there is no reasonable relationship between the normal or natural reactions of a speaker attempting to speak at a more rapid rate, and the technique of systematic deletion. It is appreciated, of course, that systematic deletion produces a result in which the pauses in the speech appear to be unnaturally long, and the consonants unnaturally short, the combined effect of which renders the compressed speech somewhat unintelligible.

SUMMARY OF THE INVENTION

In order to carry out the method of time compression in accordance with the present invention, a tape recording device is employed utilizing the speech input from a microphone, phonograph, tape recorder, or other similar structures which function in real-time, with a time-compressed reproduction being produced which may be played back on any standard playing apparatus. Essentially, the structure includes a recording means for receiving and recording speech signals from an input, with drive means being provided for the recording means, and with a power supply being coupled to the drive means. Selective deletion of portions of the speech sound is accomplished by a substantial elimination of pauses, as well as a means for eliminating periodic portions of vowel sounds. The apparatus of the present invention permits compression of speech to be undertaken to a substantial degree, with intelligible results being obtained with a compression providing a resultant play-back time of less than about 30 percent of the original speech time.

In addition to compression of speech, the apparatus of the present invention provides a means for expanding recorded speech as well. Prior techniques included the use of a slow play-back with a resulting frequency shift, but changes in pitch make the speech unintelligible if slow rates are employed. While systematic repetition of short segments of recorded speech may be utilized to preserve the pitch, the character of such a recording is diminished because of apparent breaks in the speech provided at arbitrary points. The apparatus of the present invention may function by selectively inserting additional pauses where pauses will normally occur, and thus allow play-back at the recorded time or greater, resulting in minimal, if any, loss in intelligibility.

Therefore, it is a primary object of the present invention to provide an improved speech compression apparatus which functions on the elimination or drastic shortening of pauses, coupled with the deletion of certain portions of vowel or vowel-like sounds.

It is a further object of the present invention to provide an improved apparatus for modifying speech timing including selective speech compression and expansion which is simple in construction, rugged, and relatively inexpensive.

It is yet a further object of the present invention to provide an improved speech compression apparatus which functions on the basis of shortening or eliminating pauses, and deleting certain portions of vowels or vowel-like sounds, this speech compression being accomplished with very little loss of intelligibility.

Other and further objects of the present invention will become apparent to those skilled in the art upon a study of the following specification, appended claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating the fundamental components utilized in a speech compressor apparatus prepared pursuant to the present invention;

FIG. 2 is a characteristic plot of frequency versus relative amplitude for the pre-filter structure utilized in connection with the present invention;

FIG. 3 is a plot of the frequency versus the relative amplitude for the spectrum shaping apparatus of the speech detector portion of the apparatus;

FIG. 4 is a plot of the frequency versus relative amplitude for the spectrum shaping for the vowel detector;

FIG. 5 is a schematic diagram of a speech detector system which may be employed in the apparatus utilized to practice the present invention, and capable of delivering a response curve similar to that shown in FIG. 3;

FIG. 6 is a schematic diagram of the vowel detector which may be utilized to achieve the resultant curve shown in FIG. 4;

FIG. 7 is a typical timing diagram showing how speech compression is achieved through a combination of pause deletion and vowel shortening;

FIG. 8 is a block diagram of a speech expansion structure which may be utilized in connection with the apparatus of the present invention;

FIG. 9 is a timing diagram showing how speech expansion is attained through the expander system shown in FIG. 8;

FIG. 10 is a schematic diagram of a vowel chopper which may be employed in connection with the present invention;

FIG. 11 is a schematic diagram illustrating the pause indicator which may be utilized in connection with the apparatus of the present invention;

FIG. 12 is a compression (or expansion) meter which may be utilized in connection with the apparatus of the present invention, and particularly for achieving an adjustable compression (or expansion) with a visual indication of the extent of compression; and

FIG. 13 is a schematic diagram of a portion of the speech expander concept illustrated in FIGS. 8 and 9.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Attention is now directed to FIG. 1 of the drawings wherein the speech compressor apparatus fabricated pursuant to the present invention is illustrated in block diagram form. The system includes an input 20 which delivers a speech signal to a preamplifier 21. The preamplified signal then passes to the pre-filter 22, and thence to a vowel detector system 23 and a speech detector 24. The speech detector is, in turn, coupled to tape transport 25, so as to interrupt flow of power to the tape transport upon occurrence of a pause in the speech. The output of vowel detector 23 is delivered to vowel chopper 26, and ultimately to tape transport 25 where the power supply for the tape transport is controllably regulated by vowel chopper 26.

As is indicated in the drawing of FIG. 1, the minimum pause to be retained may be adjustably preset in the speech detector. Also, the amount of vowel compression may be adjustably set in vowel chopper 26. A pause indicator, either visual or audible, such as is illustrated in FIG. 1 at 27 and 28, may also be employed if desired. Also, a visual indication of the compression occurring in the speech signal is provided as indicated at 29.

With continued attention being directed to FIG. 1 of the drawings, the record electronics section 30 represents a bias oscillator, record amplifier and record driver. The purpose of this portion of the system is to supply the appropriate electrical signal to the record and erase heads of a tape recorder, when a tape recorder is being employed. Such electronic systems are well known in the industry and are commercially available. The tape transport 25 is indicated as having a fast start/stop capability. This transport incorporates a read/write head, erase head, as well as drive means for moving the tape across the heads. In addition, a power supply is provided for the drive means, with the power supply being actuated electrically for starting and stopping the tape. For a structure to be fully compatible with the various objects and methods utilized in the present invention, the tape start-up time from full stop to full speed should be no greater than about 40 milliseconds for pause shortening operations, and no greater than about 20 milliseconds for vowel shortening. Start-up times of about 30 milliseconds and 10 milliseconds respectively are preferred. Furthermore, the stop time from full speed to full stop must be substantially the same. Tape transports having such start/stop capabilities are commercially available, and are widely used in the electronic data processing industry.

As can be appreciated, an important feature of the present invention is the generation of a control signal for the power supply to control the drive means for the recording mechanisms. As indicated, this signal is based on pause elimination and vowel shortening.

As is indicated in FIG. 1, speech signals are recorded by way of the tape transport whenever the control signal is on. Such a signal exists whenever an appropriate voltage or current level is available to place the transport in the operational mode. When a speech signal is not present, the control signal will not be present and the transport will not be moving the tape. When a speech signal is detected and it is not a vowel sound, then the transport is operative and tape is being carried across the record head. When a speech signal is present and it is ascertained that it is a vowel sound, then a first predetermined portion of the sound is recorded, and thereafter the sound is recorded on a periodic, cyclic, or "chopped" basis. For example, a vowel sound is recorded for the first t1 seconds, and for the next t2 seconds, the sound is not being recorded. Thereafter, if the vowel sound is continuing, the next t1 seconds are recorded, followed by a period of t2 seconds of no recording. This cycle continues until the speech sound becomes a non-vowel, in which case it is fully recorded, or, in the alternative, until the speech signal is no longer present, in which case the power supply is interrupted and the transport stops.
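The record/delete rule just described can be sketched as a small state machine over discrete time steps. This is a software illustration only; the patent describes analog detectors and one-shot circuits. The function name `control_signal`, the per-step boolean inputs, and the use of whole steps for the recorded interval (called `t1` here) and the deleted interval (called `t2` here) are assumptions made for the sketch.

```python
def control_signal(speech, vowel, t1, t2):
    """Return one record/delete flag per time step (True = transport runs).

    speech, vowel: per-step booleans standing in for the two detector outputs.
    t1: steps recorded at the start of each chopping cycle (assumed unit: steps).
    t2: steps deleted at the end of each chopping cycle.
    """
    if t1 <= 0 or t2 < 0:
        raise ValueError("t1 must be positive and t2 non-negative")
    control = []
    vowel_elapsed = 0  # steps since the current vowel began
    for is_speech, is_vowel in zip(speech, vowel):
        if is_speech and is_vowel:
            # Record the first t1 steps of each (t1 + t2) cycle, skip the rest.
            chopper_on = (vowel_elapsed % (t1 + t2)) < t1
            vowel_elapsed += 1
        else:
            chopper_on = True  # non-vowel speech is never chopped
            vowel_elapsed = 0
        control.append(bool(is_speech and chopper_on))
    return control


speech = [False] * 2 + [True] * 10 + [False] * 2
vowel = [False] * 4 + [True] * 6 + [False] * 4
flags = control_signal(speech, vowel, t1=2, t2=1)
print(flags)
print(sum(flags), "of", len(flags), "steps recorded")  # 8 of 14 steps recorded
```

In this toy input the two leading and two trailing pause steps are dropped, the consonant steps on either side of the vowel are kept whole, and two of the six vowel steps are deleted.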

With continued attention being directed to FIG. 1, the input signal is derived from a microphone, tape head, phonograph, radio or other transducer providing an electrical signal representing the speech sound. This signal is initially amplified in the preamplifier 21 to bring it up to standard levels, such as, for example, a peak at 0-VU at the record head. In order to reduce noise and other unwanted signals which have a frequency spectrum falling outside of the voice spectrum, the signal is preferably filtered. It has been found that the filter utilized should have the characteristics shown in the diagram of FIG. 2, with frequencies below about 250 Hz being reduced to eliminate hum and rumble, and to insure that the envelope detector does not follow the natural pitch-period resonance of certain speakers. Furthermore, frequencies substantially above approximately 6,000 Hz are reduced or eliminated in order to minimize the effects of hiss and background room noise. This filtered signal is then passed into the vowel detector and speech detector, as indicated.
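The band limits given above can be illustrated with an idealized magnitude response. The sketch below assumes a first-order high-pass section at 250 Hz cascaded with a first-order low-pass section at 6,000 Hz; the actual slopes of the FIG. 2 characteristic are not stated in the text, so the filter order, the function name `prefilter_gain_db`, and its parameters are all assumptions.

```python
import math


def prefilter_gain_db(freq_hz, low_cut_hz=250.0, high_cut_hz=6000.0):
    """Gain in dB of an assumed first-order high-pass/low-pass cascade."""
    ratio_low = freq_hz / low_cut_hz
    ratio_high = freq_hz / high_cut_hz
    high_pass = ratio_low / math.sqrt(1.0 + ratio_low ** 2)  # attenuates hum and rumble
    low_pass = 1.0 / math.sqrt(1.0 + ratio_high ** 2)        # attenuates hiss
    return 20.0 * math.log10(high_pass * low_pass)


# Evaluate at the octave-spaced frequencies used on the figure axes.
for freq in (62.5, 125, 250, 500, 1000, 2000, 4000, 8000, 16000):
    print(f"{freq:>8} Hz  {prefilter_gain_db(freq):7.1f} dB")
```

With these assumed sections the mid-band around 1,000 Hz passes almost unattenuated, while 62.5 Hz and 16,000 Hz are each reduced by roughly 9 to 12 dB. A steeper filter would reject more, at the cost of more components.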

Attention is now directed to FIG. 5 of the drawings wherein a typical speech detector system is illustrated. This detector includes components for accomplishing three basic functions: spectrum shaping, envelope detection, and threshold detection. Spectrum shaping is necessary in order that low energy speech sounds, which are necessary for good intelligibility, are weighted the same as the high energy vowel sounds. The weighting shown in FIG. 3 of the drawings has been found to provide a nearly flat spectrum at the output of the spectrum shaping circuit for most speakers. After spectrum shaping, the resulting signal is detected as indicated. Capacitor 35 charges rapidly when speech energy is present, and when the voltage reaches a threshold (about 2 volts for the circuit shown), the output signal goes to a logical level indicating speech being present. When a pause occurs, transistor 36 is turned off, and the charge on capacitor 35 discharges through variable resistor 37. When the voltage falls below a second threshold, in this case about 0.7 volts, the output signal immediately drops to a level indicating speech being absent. The time to reach this second threshold determines the length of the pause that is retained, and accordingly adjustment of variable resistor 37 may be utilized to control this time. In the circuit illustrated in FIG. 5, it is simple to utilize times as short as 10 milliseconds or less, or times as long as tens of seconds or even longer. When a signal is again present, capacitor 35 charges and an output is indicated.
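The two-threshold behavior of the speech detector can be sketched numerically. The thresholds of about 2 volts and 0.7 volts come from the text; everything else is an assumption for illustration: the capacitor is modeled as charging instantly to exactly the upper threshold, the discharge is modeled as a simple exponential with time constant RC, and the names `retained_pause_s` and `detect_speech` are hypothetical.

```python
import math

ON_THRESHOLD_V = 2.0   # from the text: output switches to "speech present"
OFF_THRESHOLD_V = 0.7  # from the text: output switches to "speech absent"


def retained_pause_s(rc_seconds, start_voltage=ON_THRESHOLD_V):
    """Time for an exponential discharge from start_voltage to the off threshold."""
    return rc_seconds * math.log(start_voltage / OFF_THRESHOLD_V)


def detect_speech(energy_present, dt, rc_seconds, charged_voltage=ON_THRESHOLD_V):
    """Per-step detector output with hysteresis (True = speech present)."""
    voltage = 0.0
    speech = False
    out = []
    decay = math.exp(-dt / rc_seconds)  # per-step discharge factor
    for present in energy_present:
        voltage = charged_voltage if present else voltage * decay
        if voltage >= ON_THRESHOLD_V:
            speech = True
        elif voltage < OFF_THRESHOLD_V:
            speech = False
        # Between the two thresholds the previous state is held.
        out.append(speech)
    return out


# 5 ms of speech, a 30 ms pause, then 3 ms of speech, at 1 ms steps, RC = 10 ms.
energy = [True] * 5 + [False] * 30 + [True] * 3
output = detect_speech(energy, dt=0.001, rc_seconds=0.01)
print(round(retained_pause_s(0.01) * 1000, 1), "ms of each pause retained")
print(output.index(False), "ms until the output drops")
```

Under these assumptions the retained pause scales linearly with RC, which is consistent with the text's statement that variable resistor 37 sets the retained pause length.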

Attention is now directed to FIG. 6 of the drawings wherein the schematic illustration of the vowel detector is shown. It is known that vowel sounds have their primary energy (first formants) between about 250 and 800 Hz. Most consonants have their primary energy in frequencies above approximately 1,000 Hz. Accordingly, voice signals are filtered by the vowel spectrum selector, the circuit shown in FIG. 6. This filter has the characteristic indicated in FIG. 4, and passes energy in the range between about 250 Hz and 800 Hz. The output of this filter will provide consonant sounds having voltage levels that are 30 db or lower in intensity than vowel sounds. The envelope detector and threshold device operate similarly to the speech detector discussed above, with one important difference being that when a vowel sound ends, the circuit is designed so that the no-vowel level appears at the output within less than about 20 milliseconds delay. It is, of course, necessary to retain a portion of the vowel sound, hence the output of the vowel detector goes to the vowel chopper shown in FIG. 10. The purpose of the circuit of FIG. 10 is to produce an output level for the power supply to the drive means for a period of t1 seconds, and interrupt this power for the next succeeding t2 seconds, alternately, as is illustrated in FIG. 7, until the vowel sounds terminate. When the vowel sounds terminate, the output again returns to a level indicating no vowels present. This function insures that consonants occurring immediately after a vowel sound are not lost. The system illustrated in FIG. 10 consists of two one-shot multivibrators and several logic gates. The time constant R1 C1 in the first one-shot multivibrator determines the time period t1, and the time constant R2 C2 in the second one-shot multivibrator determines the time period t2. The percent of the vowel sound that is deleted is, of course, equivalent to t2/(t1 + t2) × 100.

The time t1 should be chosen to contain at least several cycles of the lowest resonant voice sound anticipated for the device, this frequency typically being on the order of 100 Hz, and accordingly having a period of 10 milliseconds. Hence t1 should be at least about 30 milliseconds. On the other hand, t1 should be smaller than the shortest vowel sounds so that some shortening will, in fact, occur. In general, vowel sounds are seldom shorter than about [value missing in source] milliseconds for most speakers. Thus, the time t2 is selected in conjunction with t1 in order to obtain the desired vowel compression.

It is noted that the outputs of the vowel chopper and the speech detector are arranged in AND configuration to form the control signal to the drive means power supply. This is illustrated in FIG. 7. Thus, the control signal is off whenever speech is absent or during the time t2 when vowels are present in the speech signal. This control signal activates the tape recorder, so that the signal derived from the vowel detector and its chopper element, together with the speech detector, is utilized to activate the recorder and to interrupt or stop the recorder as appropriate. It will be appreciated that any style of recorder may be utilized, including magnetic tape, wire, disc, or the like, the primary requirement being that it have a capability of starting and stopping rapidly, as indicated hereinabove.

The signal level to the system is set at the preamplifier 21, as indicated, so that the recorder peaks are approximately at 0-VU, as a standard practice. The preamplifier is, of course, a standard type structure which is commercially available. The level set into the controller determines the signal levels that will activate the speech and vowel detectors. Thus, when the background noise is of low volume (40 db below 0-VU), this level can be set so that signals as low as 30 db below 0-VU activate the speech and vowel detectors. When the background noise increases so as to achieve a level of about 20 db below 0-VU, this level must be set so that such noise does not trigger the speech and vowel detectors, for example, the arrangement being such that only signals at [value missing in source] db below 0-VU or greater will trigger the speech and vowel detectors.
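As a worked example of the deletion percentage, write t1 for the recorded interval and t2 for the deleted interval of each chopping cycle, and take the values of about 60 milliseconds and 30 milliseconds recited in claim 3. The helper names below are hypothetical, and the second function assumes an ideal chopper with no start/stop delay.

```python
def vowel_deletion_percent(t1_ms, t2_ms):
    """Percent of a long vowel that is deleted: t2 / (t1 + t2) x 100."""
    return 100.0 * t2_ms / (t1_ms + t2_ms)


def chopped_vowel_duration_ms(vowel_ms, t1_ms, t2_ms):
    """Recorded duration of a vowel lasting vowel_ms under ideal chopping."""
    full_cycles, remainder = divmod(vowel_ms, t1_ms + t2_ms)
    return full_cycles * t1_ms + min(remainder, t1_ms)


print(round(vowel_deletion_percent(60, 30), 1))  # 33.3
print(chopped_vowel_duration_ms(200, 60, 30))    # 140
print(chopped_vowel_duration_ms(50, 60, 30))     # 50
```

The last line shows why the recorded interval should be shorter than the shortest expected vowel: a 50 millisecond vowel ends before the first deleted interval begins, so it is not shortened at all.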

In order to facilitate the setting of the level control and the pause length control, it is, of course, desirable to have visual and audible signals to indicate times when the speech detector output is off. A technique to accomplish such an arrangement is illustrated in FIG. 11. As can be seen from the schematic of FIG. 11, the light driver is activated to light a lamp when the speech detector output is off. Also, an audible tone may be generated utilizing the oscillator as illustrated. It will be appreciated that any form of oscillator will suffice for generating an audible tone. When no signal is present at the output of the speech detector, the oscillator will be activated so as to generate the audible tone at this time. The resulting tone is available by way of a speaker or headphone to the operator. This arrangement is illustrated in FIG. 1, wherein the indicator tone is added to the incoming voice signal, and hence played through the monitor speaker or headphone. The operator may thus simultaneously monitor what is being recorded and the indication of what portions are to be deleted due to the function of the speech detector and its effect on the power supply for the drive means.

Another feature of the present invention is the use of the structure for speech expansion. FIG. 9 illustrates a timing diagram showing the method of approach for speech expansion. The speech signal which is being played from a recorded medium is monitored utilizing the speech detector, and when speech is absent, as indicated by the detector, a control signal is generated which stops the play-back of the recorded signal for a period of time t, whereupon play-back resumes. The play-back continues until the speech detector goes from a speech indication level to a speech absence level, whereupon the process is repeated. One way of realizing this method of speech expansion pursuant to the present invention is shown in the block diagram of FIG. 8. In this embodiment, a tape transport is in the play-back mode and the signal to be expanded is recorded on a magnetic tape. The tape head picks up the recorded speech signal and, on the one hand, it is passed through the usual play-back electronics and presented to the listener by way of a speaker or headphone. On the other hand, it is also fed into the speech detector described in detail hereinabove, the output of the speech detector being on when speech is present, and off when speech is absent. When this output signal ceases, a one-shot multivibrator is triggered which produces a control signal. Normally the output of this one-shot multivibrator indicates that the transport is in the operational mode. When the speech detector output goes from an indication of speech to no-speech, the one-shot multivibrator is triggered and the control signal is lost, with the transport stopping for a period of t seconds, after which the transport resumes normal play-back until the speech detector output again falls to a no-speech level, whereupon the process is repeated.
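The expansion procedure can be sketched as a play-back schedule: each time the detector output falls from speech to no-speech, the transport is held for a fixed interval before play-back resumes. The sketch works in discrete steps; the function name `expand_playback`, the use of `None` to mark steps during which the transport is stopped, and the parameter `pause_steps` are assumptions for illustration.

```python
def expand_playback(speech_flags, pause_steps):
    """Return the play-back timeline for a recorded detector output.

    Each entry is either the index of the recorded step being played or
    None for a step during which the transport is stopped.
    """
    if pause_steps < 0:
        raise ValueError("pause_steps must be non-negative")
    timeline = []
    previous = False
    for index, flag in enumerate(speech_flags):
        if previous and not flag:
            # Falling edge of the detector output: the one-shot fires
            # and the transport stops for pause_steps.
            timeline.extend([None] * pause_steps)
        timeline.append(index)
        previous = flag
    return timeline


recorded = [True, True, False, True, True, True, False, False, True]
timeline = expand_playback(recorded, pause_steps=2)
print(timeline)
print(len(recorded), "steps recorded,", len(timeline), "steps on play-back")  # 9 and 13
```

Because the one-shot fires only on the falling edge, a long recorded pause receives the same inserted interval as a short one.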

A possible method of generating the interval of time of t seconds is shown in FIG. 13. Two methods are provided for adjusting the amount of expansion. The first of these is by changing the time constant RC in FIG. 13, thus changing the time t. It is appreciated that with this circuit, one is able to vary t from as low as about 20 milliseconds to as long as several seconds or more. Of course, the longer t, the more the speech is expanded. The second method of varying the amount of expansion is simply by adjusting the minimum pause before the speech detector indicates a condition of no speech. This is accomplished by adjusting RC of FIG. 5. If this time constant is sufficiently long, then short pauses will not be detected and hence not expanded, and the amount of expansion will be decreased. If even the very shortest pauses are detected, RC of FIG. 5 will be very small, and in this case there will be a greater amount of expansion.
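The effect of the two adjustments can be estimated with simple arithmetic. The sketch below assumes that a pause is detected when it is at least as long as the minimum pause set at the speech detector, and that each detected pause adds one inserted interval; the function name `expanded_duration_s` and the sample pause lengths are hypothetical.

```python
def expanded_duration_s(original_s, pause_lengths_s, min_pause_s, inserted_s):
    """Estimated play-back time after expansion.

    original_s:      duration of the recording
    pause_lengths_s: lengths of the pauses present in the recording
    min_pause_s:     shortest pause the speech detector will report
    inserted_s:      interval inserted at each detected pause
    """
    detected = sum(1 for pause in pause_lengths_s if pause >= min_pause_s)
    return original_s + detected * inserted_s


pauses = [0.05, 0.2, 0.4, 0.03]  # illustrative pause lengths in seconds
print(round(expanded_duration_s(1.5, pauses, min_pause_s=0.04, inserted_s=0.3), 2))  # 2.4
print(round(expanded_duration_s(1.5, pauses, min_pause_s=0.1, inserted_s=0.3), 2))   # 2.1
print(round(expanded_duration_s(1.5, pauses, min_pause_s=0.04, inserted_s=0.6), 2))  # 3.3
```

Raising the minimum pause reduces the number of insertion points, while lengthening the inserted interval increases the time added at each point.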

As has been indicated, the drive means and power supply for the recording means are standard and conventional in the art. Obviously, battery or AC driven units may be employed. The pause elimination and vowel shortening occur by means of controlling the current flow from the power supply to the drive means.

We claim:

1. Means for recording and selectively deleting portions of normal speech sound comprising:

a. input means, recording means for receiving and recording speech signals from said input means, drive means for said recording means, and a power supply delivering energy to said drive means;

b. speech detector means coupled to said drive means power supply for detecting the presence of a speech signal in said input means and for energizing said drive means power supply only in response to the presence of a speech signal therein;

c. vowel detector means coupled to said drive means power supply for detecting the initiation and continuing presence of vowel sounds in speech signals in said input means, said vowel detector means being adapted to regularly and periodically interrupt said drive means power supply for certain predetermined time intervals in response to the initiation and continued presence of vowel sounds in said input, with means being provided for periodically chopping said power supply into a plurality of substantially regularly spaced apart power pulses having predetermined time duration, with said periodic chopping of drive means power supply commencing after a certain predetermined time interval following initial detection of vowel presence and continuing during the presence of vowel sounds in said input.

2. The speech compression means as defined in claim 1 being particularly characterized in that filter means are provided in the speech input for passing signals of between about 250 Hz and 6,000 Hz.

3. The speech compression means as defined in claim 1 being particularly characterized in that said periodic chopping of drive means power supply provides for power pulses of about 60 milliseconds followed by an idle period of about 30 milliseconds.

4. The speech compression means as defined in claim 1 being particularly characterized in that said recording means has a start up time capability of less than about 10 milliseconds.

5. The speech compression means as defined in claim 1 being particularly characterized in that said speech detector means continues to energize said drive means power supply for a predetermined period of time greater than approximately 10 milliseconds following the termination of each speech signal.

6. The recording means as defined in claim 1 being particularly characterized in that filter means are provided for speech detection, the filter being adapted to pass signals of modest amplitude at frequencies less than about 1,000 Hz, with the amplitude increasing substantially uniformly until an input frequency of about 8,000 Hz is reached.

7. The recording means as defined in claim 6 being particularly characterized in that said increase is at a level of about 24 db./octave at frequencies from between 1,000 Hz and 8,000 Hz.

8. The recording means as defined in claim 1 being particularly characterized in that vowel detector means are provided in the speech input for passing signals having a frequency of between about 250 Hz and 1,200 Hz.

9. The speech compression means as defined in claim 1 being particularly characterized in that control means are provided for controllably adjusting the extent of compression.

10. Means for recording and selectively modifying portions of normal speech sound comprising:

a. input means, recording means for receiving and recording speech signals from said input means, drive means for said recording means, and a power supply delivering energy to said drive means;

b. speech detector means coupled to said drive means power supply for detecting the presence of a speech signal in said input means and for energizing said drive means power supply only in response to the presence of a speech signal therein;

c. vowel detector means coupled to said drive means power supply for detecting the initiation and continuing presence of vowel sounds in speech signals in said input means, said vowel detector means being adapted to regularly and periodically interrupt said drive means power supply for certain predetermined time intervals in response to the initiation and continued presence of vowel sounds in said input, with means being provided for periodically chopping said power supply into a plurality of substantially regularly spaced apart power pulses having predetermined time duration, with said periodic chopping of drive means power supply commencing after a certain predetermined time interval following initial detection of vowel presence and continuing during the presence of vowel sounds in said input; and

d. means for selectively continuing the energization of said drive means for predetermined periods of time upon detection of termination of the presence of a speech signal in said speech detector means.

11. The speech compression means as defined in claim 10 being particularly characterized in that said recording means includes first and second serially coupled recording means with drive means for each of said recording means, means for continuing the energization of said second recording means upon each occurrence of the termination of the presence of a speech signal in said first recording means.

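The drive gating recited in claims 1, 3, and 5 can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not a description of the patented circuit. The function name `drive_schedule`, the 10 millisecond frame size, the 60 millisecond delay before chopping begins, and the 20 millisecond hold after speech ends are all hypothetical values chosen for illustration; the claims specify only "a certain predetermined time interval" and a hold "greater than approximately 10 milliseconds". The sketch also assumes that each chopping cycle begins with the power pulse rather than the idle period, which the claims do not state.

```python
def drive_schedule(speech, vowel, frame_ms=10, vowel_delay_ms=60,
                   pulse_ms=60, idle_ms=30, hangover_ms=20):
    """Return one boolean per frame: True where the recorder drive is powered.

    speech, vowel: per-frame detector outputs (sequences of booleans).
    pulse_ms / idle_ms follow claim 3 (about 60 ms on, about 30 ms idle).
    vowel_delay_ms and hangover_ms are assumed values, not from the source.
    """
    delay = vowel_delay_ms // frame_ms
    pulse = pulse_ms // frame_ms
    idle = idle_ms // frame_ms
    hang = hangover_ms // frame_ms

    powered = []
    vowel_run = 0            # consecutive vowel frames seen so far
    since_speech = hang + 1  # frames since speech was last present (starts expired)
    for s, v in zip(speech, vowel):
        since_speech = 0 if s else since_speech + 1
        vowel_run = vowel_run + 1 if (s and v) else 0

        # Speech detector: power during speech and for a short hold afterwards.
        enabled = since_speech <= hang

        # Vowel detector: once the vowel has lasted past the delay,
        # chop the drive power into pulse/idle cycles.
        if enabled and vowel_run > delay:
            phase = (vowel_run - delay - 1) % (pulse + idle)
            enabled = phase < pulse
        powered.append(enabled)
    return powered


# A 300 ms sustained vowel followed by 50 ms of silence.
speech = [True] * 30 + [False] * 5
vowel = [True] * 30 + [False] * 5
powered = drive_schedule(speech, vowel)
print(sum(powered), "of", len(powered), "frames recorded")
```

With the claim 3 timing, a long sustained vowel approaches a recorded fraction of 60 / (60 + 30), or two thirds, of its original duration, while pauses longer than the hold period are dropped entirely.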